Voice conversion is the task of changing the speaker characteristics of input speech while preserving its linguistic content. It can be used in various areas, such as entertainment, medicine, and education. The quality of the converted speech is crucial for voice conversion algorithms to be useful in these applications. Deep learning-based voice conversion algorithms, which have recently shown promising results, generally consist of three modules: a feature extractor, a feature converter, and a vocoder. The feature extractor accepts a waveform as input and extracts speech feature vectors for further processing. These speech feature vectors are later synthesized back into waveforms by the vocoder. The feature converter module performs the actual voice conversion; therefore, many previous studies focused on improving this module in isolation and combined it with a separately trained vocoder to synthesize the final waveform. Because the feature converter and the vocoder are trained independently, the output of the converter may not be compatible with the input of the vocoder, which causes performance degradation. Furthermore, most voice conversion algorithms use mel-spectrogram-based speech feature vectors without modification. These feature vectors have performed well in a variety of speech-processing areas but could be further optimized for voice conversion tasks. To address these problems, we propose a novel wave-to-wave (wav2wav) voice conversion method that integrates the feature extractor, the feature converter, and the vocoder into a single module and trains the system in an end-to-end manner. We evaluated the effectiveness of the proposed method using the VCC2018 dataset.
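To illustrate the structural difference the abstract describes, the following is a minimal sketch of an integrated wav2wav model in which the feature extractor, feature converter, and vocoder are composed into one module and trained end-to-end through a single waveform-level loss. All class names, layer choices, dimensions, and the loss function are illustrative assumptions for the sketch, not the paper's actual architecture or objective.

```python
# Minimal sketch (not the authors' code): an integrated wav2wav module whose
# extractor, converter, and vocoder are trained jointly so the converter's
# output stays compatible with the vocoder's input and the learned features
# can specialize for voice conversion. All names and sizes are assumptions.
import torch
import torch.nn as nn


class FeatureExtractor(nn.Module):
    """Maps a waveform to a sequence of learned speech feature vectors."""
    def __init__(self, feat_dim=80):
        super().__init__()
        self.conv = nn.Conv1d(1, feat_dim, kernel_size=400, stride=160, padding=200)

    def forward(self, wav):                     # wav: (batch, samples)
        return self.conv(wav.unsqueeze(1))      # (batch, feat_dim, frames)


class FeatureConverter(nn.Module):
    """Converts source-speaker features toward the target speaker."""
    def __init__(self, feat_dim=80, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(feat_dim, hidden, 5, padding=2), nn.ReLU(),
            nn.Conv1d(hidden, feat_dim, 5, padding=2),
        )

    def forward(self, feats):
        return self.net(feats)


class Vocoder(nn.Module):
    """Synthesizes a waveform from the converted feature vectors."""
    def __init__(self, feat_dim=80):
        super().__init__()
        self.deconv = nn.ConvTranspose1d(feat_dim, 1, kernel_size=400, stride=160, padding=200)

    def forward(self, feats):
        return self.deconv(feats).squeeze(1)    # (batch, samples)


class Wav2Wav(nn.Module):
    """Integrated model: waveform in, converted waveform out.
    Training it as one module lets gradients flow through all three parts,
    unlike a pipeline whose converter and vocoder are trained independently."""
    def __init__(self):
        super().__init__()
        self.extractor = FeatureExtractor()
        self.converter = FeatureConverter()
        self.vocoder = Vocoder()

    def forward(self, wav):
        return self.vocoder(self.converter(self.extractor(wav)))


if __name__ == "__main__":
    model = Wav2Wav()
    src = torch.randn(2, 16000)                 # 1 s of 16 kHz audio, batch of 2
    out = model(src)
    # A single waveform-level loss updates all three modules jointly
    # (end-to-end); the actual training objective in the paper may differ.
    loss = nn.functional.l1_loss(out, src)
    loss.backward()
    print(out.shape, loss.item())
```

In a conventional pipeline, the converter would be trained against fixed mel-spectrogram targets and a pre-trained vocoder would synthesize from its output; the joint formulation above removes that interface mismatch by optimizing all modules with one objective.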